Abstract: We consider the discrete, time-varying broadcast
channel with memory under the assumption that the channel
states belong to a set of finite cardinality. We first define the
physically degraded finite-state broadcast channel for which we derive
the capacity region. We then define the stochastically degraded
finite-state broadcast channel and derive the capacity region for
this scenario as well. In both scenarios we consider the
non-indecomposable finite-state channel as well as the indecomposable
one.

I. Introduction 

The broadcast channel (BC) was introduced by Cover in
1972. In this scenario a single sender transmits three messages,
one common and two private, to two receivers over a channel
defined by $\{\mathcal{X}, p(y,z|x), \mathcal{Y}\times\mathcal{Z}\}$. Here, $X$ is the channel
input from the transmitter, $Y$ is the channel output at Rx1
and $Z$ is the channel output at Rx2. In the years following its
introduction the study of the BC focused on memoryless
scenarios, i.e., when the probability of a block of $n$ transmissions
is given by $p(y^n, z^n|x^n) = \prod_{i=1}^{n} p(y_i, z_i|x_i)$. In recent years,
models of time-varying broadcast channels with memory have 
attracted a lot of attention, especially Gaussian BCs. This was 
motivated by the proliferation of mobile communications, for 
which the channel is subject to time-varying correlated fading. 
The correlation of the fading process introduces memory in the
BC. The fading BC is one instance of the general BC with 
channel states. While fading BCs have received considerable 
attention, discrete, time-varying BCs with channel states have 
not been well studied. A notable exception is the degraded 
arbitrarily varying BC (DAVBC) considered in [2] and [3]. 
In [2] DAVBCs with causal and non-causal side information 
at the transmitter were considered. The states are assumed 
i.i.d. and the channel is memoryless: $p(y^n, z^n|x^n, s^n) =
\prod_{i=1}^{n} p(y_i, z_i|x_i, s_i)$. In [3], the capacity region for DAVBCs
with causal side information at the transmitter and non-causal 
side information at the good receiver was derived. In [3] the 
state distribution is general and is not subject to the i.i.d. 
restriction, but the channel outputs, given the states and the 
channel inputs are again memoryless. The general, discrete 
BC with i.i.d. states non-causally known at the transmitter 
was considered in [4]. 

The arbitrarily varying channel (AVC) is one model for 
a time-varying channel with states. It models a memoryless 
channel whose law varies in time in an arbitrary manner. The 
state transitions are independent of the channel inputs and 
outputs. In this work we study the discrete time-varying BC
with memory in the framework of finite-state channels (FSCs).
In contrast to the AVC, in the FSC both the channel output
and the current state depend on both the channel input and the
previous state.

The finite-state channel model was used to model point-to-point
channel variations as early as 1953 [1]. This channel is
characterized by the distribution $p(y, s|x, s')$ where $S$ is the
current state and $S'$ is the previous state. For a block of $n$
transmissions, the p.m.f. at the $i$'th symbol time satisfies

\[ p(y_i, s_i | x^i, s^{i-1}, y^{i-1}, s_0) = p(y_i, s_i | x_i, s_{i-1}), \tag{1} \]
where $s_0$ is the state of the channel when transmission began.
Equation (1) implies that $S_{i-1}$ contains all the history
information for time $i$. Recently, the finite-state multiple-access
channel was studied in [6]. This scenario is characterized by
the channel distribution $p(y, s|x_1, x_2, s')$, and the work in [6]
also considered the effect of feedback on the rates.

In the present work we study the finite-state broadcast 
channel (FSBC). Here, the channel from the transmitter to 
the receivers is governed by a state sequence that depends on 
the channel inputs, outputs and previous states. The way these 
symbols interact with each other is captured by the transition 
function $p(y, z, s|x, s')$.

Main Contributions and Organization 

In this paper we consider for the first time the capacity 
of the FSBC. Here, there is a unique aspect not encountered 
in the point-to-point and the MAC counterparts, namely the 
application of superposition coding to the FSC. We initially 
define the physically degraded FSBC and find the capacity 
region of this scenario. We then define the stochastically de- 
graded FSBC and give examples of communication scenarios 
represented by this model. We derive the capacity region for 
this channel as well. 

The rest of this paper is organized as follows: Section II
introduces the channel model. Section III presents a summary
of the results together with a discussion. Lastly, Section IV
outlines the proof of the capacity region for the physically
degraded FSBC.

II. Channel Model and Definitions 

First, a word about notation. In the following we denote
random variables with upper case letters, e.g. $X$, $Y$, and their
realizations with lower case letters $x$, $y$. A random variable
(RV) $X$ takes values in a set $\mathcal{X}$. We use $\|\mathcal{X}\|$ to denote the
cardinality of a finite, discrete set $\mathcal{X}$, $\mathcal{X}^n$ to denote the $n$-fold
Cartesian product of $\mathcal{X}$, and $p_X(x)$ to denote the probability
mass function (p.m.f.) of a discrete RV $X$ on $\mathcal{X}$. For brevity
we may omit the subscript $X$ when it is obvious from the
context. We use $p_{X|Y}(x|y)$ to denote the conditional p.m.f. of
$X$ given $Y$. We denote vectors with boldface letters, e.g. $\mathbf{x}$, $\mathbf{y}$;
the $i$'th element of a vector $\mathbf{x}$ is denoted with $x_i$ and we use
$x_i^j$ where $i < j$ to denote the vector $(x_i, x_{i+1}, \ldots, x_{j-1}, x_j)$;
$x^j$ is short form notation for $x_1^j$, and $\mathbf{x} \equiv x^n$. A vector
of $n$ random variables is denoted by $X^n$, and similarly we
define $X_i^j \equiv (X_i, X_{i+1}, \ldots, X_{j-1}, X_j)$ for $i < j$. We use
$H(\cdot)$ to denote the entropy of a discrete random variable and
$I(\cdot;\cdot)$ to denote the mutual information between two random
variables, as defined in [7, Chapter 2]. $I(\cdot;\cdot)_q$ denotes the
mutual information evaluated with a p.m.f. $q$ on the random
variables. Finally, $\mathrm{co}\,\mathcal{R}$ denotes the convex hull of the set $\mathcal{R}$.

Definition 1: The discrete, finite-state broadcast channel is
defined by the triplet $\{\mathcal{X}\times\mathcal{S}, p(y, z, s|x, s'), \mathcal{Y}\times\mathcal{Z}\times\mathcal{S}\}$ where
$X$ is the input symbol, $Y$ and $Z$ are the output symbols, $S'$
is the state of the channel at the end of the previous symbol
transmission and $S$ is the state of the channel at the end of
the current symbol transmission. $\mathcal{S}$, $\mathcal{X}$, $\mathcal{Y}$ and $\mathcal{Z}$ are discrete
alphabets of finite cardinalities. The p.m.f. of a block of $n$
transmissions is

\begin{align*}
p(y^n, z^n, s^n, x^n | s_0)
&= \prod_{i=1}^{n} p(y_i, z_i, s_i, x_i | y^{i-1}, z^{i-1}, s^{i-1}, x^{i-1}, s_0) \\
&= \prod_{i=1}^{n} p(x_i | x^{i-1})\, p(y_i, z_i, s_i | y^{i-1}, z^{i-1}, s^{i-1}, x^{i}, s_0) \\
&\overset{(a)}{=} p(x^n) \prod_{i=1}^{n} p(y_i, z_i, s_i | x_i, s_{i-1}), \tag{2}
\end{align*}

where $s_0$ is the initial channel state. Here (a) captures the fact
that given $x_i$ and $s_{i-1}$ the symbols at time $i$ are independent of the
past.
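
As a rough illustration of the product form in (2), the following Python sketch computes $p(y^n, z^n | x^n, s_0)$ for a fixed input sequence by a forward recursion that sums over the state path. The names `toy_kernel` and `block_output_pmf`, the two-state binary kernel and all numeric values are assumptions made for this example only; they are not part of the channel model above.

```python
def toy_kernel(stay=0.9, eps1=(0.10, 0.18), eps12=0.0625):
    """Hypothetical two-state binary kernel p(y, z, s | x, s_prev).

    The state stays put with probability `stay`, Y is X flipped with
    probability eps1[s], and Z is Y flipped with probability eps12.
    """
    kernel = {}
    for x in (0, 1):
        for s_prev in (0, 1):
            cell = {}
            for s in (0, 1):
                p_s = stay if s == s_prev else 1.0 - stay
                for y in (0, 1):
                    p_y = eps1[s] if y != x else 1.0 - eps1[s]
                    for z in (0, 1):
                        p_z = eps12 if z != y else 1.0 - eps12
                        cell[(y, z, s)] = p_s * p_y * p_z
            kernel[(x, s_prev)] = cell
    return kernel


def block_output_pmf(kernel, x_seq, s0):
    """Return {(y_seq, z_seq): prob} for p(y^n, z^n | x^n, s_0).

    Applies the per-symbol kernel once per channel use and keeps only the
    current state, so state paths ending in the same state are merged.
    """
    alpha = {((), (), s0): 1.0}  # (y prefix, z prefix, current state) -> prob
    for x in x_seq:
        nxt = {}
        for (ys, zs, s_prev), pr in alpha.items():
            for (y, z, s), q in kernel[(x, s_prev)].items():
                key = (ys + (y,), zs + (z,), s)
                nxt[key] = nxt.get(key, 0.0) + pr * q
        alpha = nxt
    out = {}
    for (ys, zs, _s), pr in alpha.items():  # marginalize the final state
        out[(ys, zs)] = out.get((ys, zs), 0.0) + pr
    return out
```

The recursion is exponential in $n$ because it enumerates output sequences, so it is only meant for checking small cases by hand.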

Definition 2: The FSBC is called physically degraded if its
p.m.f. satisfies

\begin{align*}
p(y_i | x^i, y^{i-1}, z^{i-1}, s_0) &= p(y_i | x^i, y^{i-1}, s_0), \tag{3a} \\
p(z_i | x^i, y^i, z^{i-1}, s_0) &= p(z_i | y^i, z^{i-1}, s_0). \tag{3b}
\end{align*}
Condition (3a) captures the intuitive notion of degradedness,
namely that $Z^{i-1}$ is a degraded version of $Y^{i-1}$, thus it does
not add information when $Y^{i-1}$ is given. Note that in the
memoryless case this condition is not necessary as, given $X_i$,
$Y_i$ is independent of the history. Condition (3b) follows from
the standard notion of degradedness.

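Using conditions (3a) and (3b) we obtain (when
$p(y^n, x^n | s_0) > 0$)

\begin{align*}
p(z^n | y^n, x^n, s_0)
&= \frac{p(z^n, y^n, x^n | s_0)}{p(y^n, x^n | s_0)} \\
&= \frac{\prod_{i=1}^{n} p(z_i, y_i, x_i | z^{i-1}, y^{i-1}, x^{i-1}, s_0)}
        {\prod_{i=1}^{n} p(y_i, x_i | y^{i-1}, x^{i-1}, s_0)} \\
&\overset{(a)}{=} \frac{\prod_{i=1}^{n} p(x_i | x^{i-1}) \prod_{i=1}^{n} p(y_i, z_i | z^{i-1}, y^{i-1}, x^{i}, s_0)}
        {\prod_{i=1}^{n} p(x_i | x^{i-1}) \prod_{i=1}^{n} p(y_i | y^{i-1}, x^{i}, s_0)} \\
&\overset{(b)}{=} \frac{\prod_{i=1}^{n} p(y_i | y^{i-1}, x^{i}, s_0) \prod_{i=1}^{n} p(z_i | z^{i-1}, y^{i}, x^{i}, s_0)}
        {\prod_{i=1}^{n} p(y_i | y^{i-1}, x^{i}, s_0)} \\
&\overset{(c)}{=} \prod_{i=1}^{n} p(z_i | z^{i-1}, y^{i}, s_0), \tag{4}
\end{align*}

where (a) is because there is no feedback, (b) follows from
(3a) and (c) follows from (3b). We conclude that when (3)
holds, $p(z^n | y^n, x^n, s_0) = p(z^n | y^n, s_0)$. Hence,

\[ p(y^n, z^n | x^n, s_0) = p(y^n | x^n, s_0)\, p(z^n | y^n, s_0). \tag{5} \]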

Note that (4) shows how to obtain $p(z^n | y^n, x^n, s_0)$ in a
causal manner. Also note that $Z^n$ is a degraded version of
$Y^n$ but still depends on the state sequence (i.e. degradedness
does not eliminate the memory). A special case of the
physically degraded FSBC occurs when in (3b) it holds that
$p(z_i | x^i, y^i, z^{i-1}, s_0) = p(z_i | y_i)$. Hence,

\[ p(z^n | y^n, x^n, s_0) = p(z^n | y^n) = \prod_{i=1}^{n} p(z_i | y_i). \tag{6} \]

Equation (6) is similar to the definition of degradedness for
the DAVBC used in [2].

Definition 3: The FSBC is called stochastically degraded if
there exists a p.m.f. $p(z|y)$ such that

\begin{align*}
p(z, s | x, s') &= \sum_{y} p(y, s | x, s')\, p(z | y, s, x, s') \\
&= \sum_{y} p(y, s | x, s')\, p(z | y). \tag{7}
\end{align*}

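Note that when (7) holds then

\begin{align*}
p(z^n | x^n, s_0)
&= \sum_{s^n} p(z^n, s^n | x^n, s_0) \\
&= \sum_{s^n} \prod_{i=1}^{n} p(z_i, s_i | x_i, s_{i-1}) \\
&= \sum_{s^n} \prod_{i=1}^{n} \sum_{y_i} p(y_i, s_i | x_i, s_{i-1})\, p(z_i | y_i) \\
&= \sum_{s^n} \sum_{y^n} \prod_{i=1}^{n} p(y_i, s_i | x_i, s_{i-1})\, p(z_i | y_i) \\
&= \sum_{y^n} p(y^n | x^n, s_0) \prod_{i=1}^{n} p(z_i | y_i), \tag{8}
\end{align*}

where the third equality follows from (7).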

Definition 3 is not merely a mathematical convenience; it
represents a physical scenario. For example,
consider a scenario in which a base station transmits to two
mobile units, located approximately on the same line-of-sight
from the base station (BS), as indicated by the dashed line
in Figure 1. Let the BS transmit a BPSK signal and let
the received signals be subject to additive Gaussian thermal
noise due to the receivers' front-ends. When decoding at
the receivers takes place after a hard threshold at zero, the
resulting scenario is the binary symmetric broadcast channel
(BSBC). Denote the situation where there is no traffic on the
road between the BS and the mobiles as state $A$. Let the
channel BS-Rx1 have a crossover probability $\epsilon_1(A) = 0.1$
and the channel BS-Rx2 have a crossover probability $\epsilon_2(A) =
0.15$. This can be represented as a stochastically degraded BC
with a degrading channel whose crossover probability is

\[ \epsilon_{12}(A) = \frac{\epsilon_2(A) - \epsilon_1(A)}{1 - 2\epsilon_1(A)} = 0.0625. \]

Assume that on occasions, a car passes on the road between the
BS and the mobiles. This causes attenuation in both channels
simultaneously. Call this state $B$ and let $\epsilon_1(B) = 0.18$ and
$\epsilon_2(B) = 0.22$. Again we have $\epsilon_{12}(B) = 0.0625$.$^1$ Hence, the
degrading channel is the same for both states, irrespective of
the state sequence (in this example the state sequence
represents the traffic pattern, and is not an independent sequence).
This satisfies condition (7).
Fig. 1. A degraded FSBC scenario: the mobile units are located on the same
line-of-sight from the base-station (indicated by the dashed line). Passing cars 
affect the channels to both mobile units simultaneously. 
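
The arithmetic of this example can be checked with a short Python sketch. It assumes only the standard fact that two binary symmetric channels in series with crossover probabilities $a$ and $b$ form a binary symmetric channel with crossover $a(1-b) + (1-a)b$; the function names are hypothetical and introduced for illustration.

```python
def cascade_crossover(eps_a, eps_b):
    """Crossover of two BSCs in series: exactly one of the two flips occurs."""
    return eps_a * (1.0 - eps_b) + (1.0 - eps_a) * eps_b


def degrading_crossover(eps1, eps2):
    """Crossover eps12 of the BSC that degrades BSC(eps1) into BSC(eps2).

    Obtained by solving cascade_crossover(eps1, eps12) == eps2 for eps12.
    """
    if not (0.0 <= eps1 < 0.5 and eps1 <= eps2 <= 0.5):
        raise ValueError("need 0 <= eps1 <= eps2 <= 0.5 and eps1 < 0.5")
    return (eps2 - eps1) / (1.0 - 2.0 * eps1)


# Values from the example: state A (no traffic) and state B (passing car).
eps12_a = degrading_crossover(0.10, 0.15)
eps12_b = degrading_crossover(0.18, 0.22)
```

Both states give the same degrading crossover probability, which is what allows a single $p(z|y)$ to serve in Definition 3 for every state.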

More generally, we can define a set of states for this
scenario, e.g. $\mathcal{S} = \{1, 2, \ldots, K\}$, with $\mathcal{Y} = \mathcal{Z} = \{0, 1\}$ and

\[ p(z_i, s_i | y_i, s_{i-1}) = p(s_i | s_{i-1})\, p(z_i | y_i, s_i), \]

\[ p(z | y, s = k) = \begin{cases} \epsilon_{12}(k), & z \ne y \\ 1 - \epsilon_{12}(k), & z = y, \end{cases} \]

$\epsilon_{12}(k) \in (0, 0.5)$, $k \in \mathcal{S}$. This results in a collection of
physically degraded BSBCs that can give more flexibility in
modeling the scenario of Figure 1, as the degrading channel
may depend on the state. However, for this reason, this model
does not satisfy our definition of stochastic degradedness in
Definition 3.

Definition 4: (see [5, Section 4.6]) The FSBC is called
indecomposable if for every $\epsilon > 0$ there exists $N_0(\epsilon)$ such
that for all $n > N_0(\epsilon)$, $|p(s_n | \mathbf{x}, s_0) - p(s_n | \mathbf{x}, s_0')| < \epsilon$, for all
$s_n$, $\mathbf{x}$, and initial states $s_0$ and $s_0'$.
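
For small alphabets the gap in Definition 4 can be evaluated by brute force. The Python sketch below is an illustration under stated assumptions: `trans[x][s_prev][s]` is taken to be the state marginal $p(s|x,s')$ of the channel kernel, the function names are hypothetical, and the two toy chains (`mixing`, `frozen`) are invented for the example. It evaluates the worst-case gap for one block length $n$ at a time, so it can suggest, but not prove, indecomposability.

```python
import itertools


def state_dist(trans, x_seq, s0):
    """Distribution of S_n given the input sequence x_seq and initial state s0."""
    n_states = len(trans[0])
    dist = [1.0 if s == s0 else 0.0 for s in range(n_states)]
    for x in x_seq:
        dist = [sum(dist[sp] * trans[x][sp][s] for sp in range(n_states))
                for s in range(n_states)]
    return dist


def worst_gap(trans, n):
    """Max over x^n, s_n, s_0, s_0' of |p(s_n | x^n, s_0) - p(s_n | x^n, s_0')|."""
    n_inputs, n_states = len(trans), len(trans[0])
    worst = 0.0
    for x_seq in itertools.product(range(n_inputs), repeat=n):
        dists = [state_dist(trans, x_seq, s0) for s0 in range(n_states)]
        for a, b in itertools.combinations(dists, 2):
            worst = max(worst, max(abs(p - q) for p, q in zip(a, b)))
    return worst


# Toy chain whose memory of the initial state decays with n.
mixing = [[[0.9, 0.1], [0.2, 0.8]],   # transitions under x = 0
          [[0.7, 0.3], [0.4, 0.6]]]   # transitions under x = 1
# Toy chain that never leaves its initial state, hence not indecomposable.
frozen = [[[1.0, 0.0], [0.0, 1.0]]] * 2
```

For the `mixing` chain the gap shrinks geometrically in $n$, while for the `frozen` chain it stays at 1 for every $n$.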

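Definition 5: An $(R_0, R_1, R_2, n)$ deterministic code for the
FSBC consists of three message sets, $\mathcal{M}_0 = \{1, 2, \ldots, 2^{nR_0}\}$,
$\mathcal{M}_1 = \{1, 2, \ldots, 2^{nR_1}\}$ and $\mathcal{M}_2 = \{1, 2, \ldots, 2^{nR_2}\}$, and three
mappings $(f, g_y, g_z)$ such that

\[ f : \mathcal{M}_0 \times \mathcal{M}_1 \times \mathcal{M}_2 \mapsto \mathcal{X}^n \tag{9} \]
is the encoder and

\begin{align*}
g_y &: \mathcal{Y}^n \mapsto \mathcal{M}_0 \times \mathcal{M}_1, \\
g_z &: \mathcal{Z}^n \mapsto \mathcal{M}_0 \times \mathcal{M}_2,
\end{align*}

are the decoders. Here, $\mathcal{M}_0$ is the set of common messages
and $\mathcal{M}_1$ and $\mathcal{M}_2$ are the sets of private messages to Rx1 and
Rx2 respectively.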

1 The scenario parameters assumed in this example are: Two-ray
propagation model, Rx decoding scheme is maximum-likelihood, Base station Tx
power = 30 dBm, Base station antenna gain = 10 dBi, Rx antenna gain =
dBi, Rx noise floor = -90 dBm, Base station antenna height = 10 m, Rx
antenna height = 1.5 m, BS-Rx1 distance = 7.2 Km and BS-Rx2 distance =
8 Km. We also assume a passing car increases the path attenuation by 3 dB.



Note that we assume no knowledge of the states at the 
transmitter and receivers. 

Definition 6: The average probability of error of a code for
the FSBC is given by $P_e^{(n)} = \max_{s_0 \in \mathcal{S}} P_e^{(n)}(s_0)$, where

\[ P_e^{(n)}(s_0) = \Pr\big( g_y(Y^n) \ne (M_0, M_1) \text{ or } g_z(Z^n) \ne (M_0, M_2) \,\big|\, s_0 \big), \]

where each of the messages $M_0 \in \mathcal{M}_0$, $M_1 \in \mathcal{M}_1$ and $M_2 \in
\mathcal{M}_2$ is selected independently and uniformly.

Definition 7: A rate triplet $(R_0, R_1, R_2)$ is called achievable
for the FSBC if for every $\epsilon > 0$ and $\delta > 0$ there exists an
$n(\epsilon, \delta)$ such that for all $n > n(\epsilon, \delta)$ an $(R_0 - \delta, R_1 - \delta, R_2 -
\delta, n)$ code with $P_e^{(n)} < \epsilon$ can be constructed.

Definition 8: The capacity region of the FSBC is the con- 
vex hull of all achievable rate triplets. 

III. Main Results and Discussion 
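Define first

\begin{align*}
R_{1,n}(p, s_0) &= \frac{1}{n} I(X^n; Y^n | U^n, s_0)_p - \frac{\log_2 \|\mathcal{S}\|}{n}, \\
R_{2,n}(p, s_0) &= \frac{1}{n} I(U^n; Z^n | s_0)_p - \frac{\log_2 \|\mathcal{S}\|}{n}.
\end{align*}

The main result is stated in the following theorem, whose
proof is outlined in Section IV.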

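Theorem 1: Let $\mathcal{Q}_n$ be the set of all joint distributions on
$(\times_{i=1}^{n} \mathcal{U}_i, \mathcal{X}^n)$ such that the cardinality of the random vector
$U^n$ is bounded by $\|\times_{i=1}^{n} \mathcal{U}_i\| \le \min\{\|\mathcal{X}\|, \|\mathcal{Y}\|, \|\mathcal{Z}\|\}^n$.
For the physically degraded FSBC of Definition 2 define the
region $\mathcal{R}_n(s_0)$ as

\begin{align*}
\mathcal{R}_n(s_0) = \mathrm{co} \bigcup_{q_n \in \mathcal{Q}_n} \Big\{ (R_0, R_1, R_2) :\ & R_0 \ge 0,\ R_1 \ge 0,\ R_2 \ge 0, \\
& R_1 \le R_{1,n}(q_n, s_0),\ R_0 + R_2 \le R_{2,n}(q_n, s_0) \Big\}. \tag{10}
\end{align*}

The capacity region of the physically degraded FSBC is given
by

\[ \mathcal{C}_{pd} = \lim_{n \to \infty} \bigcap_{s_0 \in \mathcal{S}} \mathcal{R}_n(s_0), \tag{11} \]

and the limit exists.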

Since the capacity of the broadcast channel depends only on
the conditional marginals $p(y^n | x^n, s_0)$ and $p(z^n | x^n, s_0)$ (see
[7, Chapter 14.6]), the capacity region of the stochastically
degraded FSBC is the same as that of the corresponding physically
degraded FSBC:

Corollary 1: For the stochastically degraded FSBC of
Definition 3 the capacity region is given by Theorem 1, where
$p(z | s, y, x, s')$ is replaced by $p(z|y)$ that satisfies equation (7).

When the FSBC is indecomposable, then the effect of the 
initial state fades away as n increases. Therefore we have the 
following corollary: 

Corollary 2: For the indecomposable physically degraded
FSBC, the capacity region is given by Theorem 1. For the
indecomposable stochastically degraded FSBC, the capacity
region is obtained from Corollary 1. In both cases the
parameter $s_0$ in $R_{1,n}(q_n, s_0)$ and $R_{2,n}(q_n, s_0)$ and the intersection
over $\mathcal{S}$ in the expression for $\mathcal{C}_{pd}$ are omitted.



Proof outline: Loosely speaking, the corollary is true since
for $n$ large enough the effect of the initial state fades away.
Therefore, for asymptotically large $n$ the maximum over all
initial states $s_0 \in \mathcal{S}$ equals the minimum.

Discussion 

First, note that if $\lim_{n \to \infty} \mathcal{R}_n(s_0)$ exists for all $s_0 \in \mathcal{S}$ then
the capacity region (11) can be written as

\[ \mathcal{C}_{pd} = \lim_{n \to \infty} \bigcap_{s_0 \in \mathcal{S}} \mathcal{R}_n(s_0) \overset{(a)}{=} \bigcap_{s_0 \in \mathcal{S}} \lim_{n \to \infty} \mathcal{R}_n(s_0). \]

Here, (a) is permitted because $\mathcal{S}$ is finite. Thus, the capacity
region can be viewed as the intersection of all the capacity
regions obtained when the initial state is known at the receivers
(but not at the transmitter). We also note the following
conclusions:

1) Since the limit of the region exists, optimizing the code
over longer block lengths $n$ results in better performance. This
is not guaranteed when the limit cannot be shown to exist;
consider, for example, a non-stationary channel with noise that
oscillates with time.

2) The codebook structure that achieves capacity is a 
superposition codebook. This introduces a structural constraint 
when optimizing the codebook for achieving the maximum 
rate triplets. 

3) The auxiliary RV $U^n$ introduces difficulties mainly in
places where we need to rely on its cardinality. This is
because we cannot translate the bound on the cardinality of
$U^n$ into a bound on the cardinality of a subset of $U^n$. In
particular, we cannot use the cardinality of $U^n$ when deriving
the capacity region for the indecomposable FSBC. Moreover,
letting $n = m_1 + m_2$, then from Equation (2) we have that

\[ p(z^{m_1}, y^{m_1}, s^{m_1} | x^n, s_0) = p(z^{m_1}, y^{m_1}, s^{m_1} | x^{m_1}, s_0). \]
But because $p(x^{m_1} | u^n) \ne p(x^{m_1} | u^{m_1})$ then

\[ p(z^{m_1}, y^{m_1}, s^{m_1} | u^n, s_0) \ne p(z^{m_1}, y^{m_1}, s^{m_1} | u^{m_1}, s_0). \]

This is a major difference from the point-to-point and the MAC
channels. Consider, for example, the expression

\[ \max_{q_n \in \mathcal{Q}_n} \left( \max_{s_0 \in \mathcal{S}} \frac{1}{n} I(U^n; Z^n | s_0) + \lambda \max_{s_0' \in \mathcal{S}} \frac{1}{n} I(X^n; Y^n | U^n, s_0') \right). \tag{12} \]

While in the MAC and the point-to-point channels the
corresponding expressions converge for all channels, for the FSBC
(12) can be shown to converge only for the indecomposable
case. Therefore, using superposition coding, the channel
between $U^n$ and $(Y^n, Z^n)$ is fundamentally different from the
channel between $X^n$ and $(Y^n, Z^n)$. This is in contrast also
to the discrete, memoryless BC.

IV. Proof Outline 

In the derivation we focus on the physically degraded FSBC.
The derivation requires only that condition (5) holds. In the
derivation we shall consider only the two private messages
case as the common message can be incorporated by splitting
the rate to Rx2 into private and common rates, as in [7,
Theorem 14.6.4].




Fig. 2. Lines bounding the achievable regions for the FSBC for initial states
$s_a$ and $s_b$, and the resulting region of positive error exponents.

A. Achievability Theorem 

Due to space limitations we omit the details of the achiev- 
ability proof and give only the conclusion. For complete details 
see [8]. Define first 

\[ F_n(\lambda) = \max_{p(u^n, x^n) \in \mathcal{Q}_n} \left\{ \min_{s_0 \in \mathcal{S}} R_{2,n}(p, s_0) + \lambda \min_{s_0' \in \mathcal{S}} R_{1,n}(p, s_0') \right\}. \]

Following [9, Section 2], the boundary of the region of positive
error exponents for a given $n$ can be written as

\[ R_2^n(R_1) = \inf_{0 \le \lambda \le 1} \left\{ F_n(\lambda) - \lambda R_1 \right\}. \tag{13} \]

This characterization is illustrated in Figure 2.

In the achievability proof we show that for a given
$p(u^n, x^n)$, when transmitting at the positive rate pair
$(\min_{s_0' \in \mathcal{S}} R_{1,n}(p, s_0'), \min_{s_0 \in \mathcal{S}} R_{2,n}(p, s_0))$, then the error
exponent is positive and bounded away from zero. Hence, the
probability of error can be made less than any arbitrary $\epsilon > 0$
by taking a block length $Kn$ with a large enough integer $K$.

Furthermore, in Section IV-D we show that the largest region
is obtained by taking the limit

\[ R_2(R_1) = \inf_{0 \le \lambda \le 1} \left\{ \lim_{n \to \infty} F_n(\lambda) - \lambda R_1 \right\}, \tag{14} \]

and that this limit exists and is finite. The fact that the limit
exists and is finite implies that we can approach the rates of
Theorem 1 arbitrarily close by taking $n$ large enough, thus by
Definition 7 these rates are achievable.

Before considering the converse we discuss the cardinality
of the auxiliary RV $U^n$, as the evaluation of $R_2^n(R_1)$ of (13)
depends on the existence of such a bound.

B. Cardinality Bounds 

From the derivation in [9], it follows that maximizing the
region $\mathcal{R}_n(s_0)$ of Equation (10) over all joint distributions
$p(u^n, x^n)$ can be carried out while the cardinality of the
auxiliary random variable $U^n$ is bounded by

\[ \|\times_{i=1}^{n} \mathcal{U}_i\| \le \min\{\|\mathcal{X}\|, \|\mathcal{Y}\|, \|\mathcal{Z}\|\}^n. \tag{15} \]

Now note that from (11), the achievable region for a fixed
$n$ is given by the intersection $\bigcap_{s_0 \in \mathcal{S}} \mathcal{R}_n(s_0)$. As for each
$\mathcal{R}_n(s_0)$, $s_0 \in \mathcal{S}$, we have the same cardinality bound,
this bound also holds for maximizing the intersection of the
regions $\mathcal{R}_n(s_0)$, $s_0 \in \mathcal{S}$.



C. Converse 

Lemma 1: If for some $\epsilon > 0$, $\lambda > 0$,

\[ R_2 + \lambda R_1 > \lim_{n \to \infty} F_n(\lambda) + \epsilon, \]

then there exists a pair of initial states $s_0$ and $s_0'$ such that

\[ P_e^{(n)}(s_0) R_2 + \lambda P_e^{(n)}(s_0') R_1 > \epsilon - \frac{1}{n}(1 + \lambda)(1 + \log_2 \|\mathcal{S}\|). \]
The implication of this inequality, as explained in [9, Section
3], is that for $n$ large enough the probability of error $P_e^{(n)}$
cannot be made arbitrarily small outside the region (14).
Proof: From Fano's inequality we have that

\begin{align*}
H(M_2 | Z^n, s_0) &\le P_e^{(n)}(s_0)\, n R_2 + 1, \tag{16a} \\
H(M_1 | Y^n, s_0) &\le P_e^{(n)}(s_0)\, n R_1 + 1. \tag{16b}
\end{align*}
Next write

\begin{align*}
\min_{s_0 \in \mathcal{S}} I(M_2; Z^n | s_0) &= n R_2 - \max_{s_0 \in \mathcal{S}} H(M_2 | Z^n, s_0), \tag{17} \\
\min_{s_0' \in \mathcal{S}} I(M_1; Y^n | M_2, s_0') &= n R_1 - \max_{s_0' \in \mathcal{S}} H(M_1 | Y^n, M_2, s_0') \\
&\ge n R_1 - \max_{s_0' \in \mathcal{S}} H(M_1 | Y^n, s_0'). \tag{18}
\end{align*}

Now note that

\begin{align*}
I(M_2; Z^n | s_0) &= H(Z^n | s_0) - H(Z^n | M_2, s_0) \\
&= I(U^n; Z^n | s_0), \tag{19}
\end{align*}
where $U_i = M_2$, $i = 1, 2, \ldots, n$. We also have
\begin{align*}
I(M_1; Y^n | M_2, s_0') &= H(Y^n | M_2, s_0') - H(Y^n | M_1, M_2, s_0') \\
&\le H(Y^n | U^n, s_0') - H(Y^n | X^n, U^n, s_0') \\
&= I(X^n; Y^n | U^n, s_0'), \tag{20}
\end{align*}
where the definition of $U^n$ satisfies the Markov relationship
$U^n | s_0' - X^n | s_0' - Y^n | s_0'$. Combining (19) and (20) we have
that for this choice of $U^n$:

\begin{align*}
\min_{s_0 \in \mathcal{S}} I(M_2; Z^n | s_0) &+ \lambda \min_{s_0' \in \mathcal{S}} I(M_1; Y^n | M_2, s_0') \\
&\le \min_{s_0 \in \mathcal{S}} I(U^n; Z^n | s_0) + \lambda \min_{s_0' \in \mathcal{S}} I(X^n; Y^n | U^n, s_0') \\
&\le n F_n(\lambda) + (1 + \lambda) \log_2 \|\mathcal{S}\|, \tag{21}
\end{align*}
since $F_n(\lambda)$ is obtained by maximizing over all joint
distributions $p(u^n, x^n)$ subject to the cardinality constraint (15), which
is also satisfied by our choice of $U^n$. Let $s_{0,n}$ and $s_{0,n}'$ be
the maximizing states for $H(M_2 | Z^n, s_0)$ and $H(M_1 | Y^n, s_0')$
respectively.

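Plugging (17) and (18) into (21) yields
\[ n R_2 - H(M_2 | Z^n, s_{0,n}) + \lambda \big( n R_1 - H(M_1 | Y^n, s_{0,n}') \big) - (1 + \lambda) \log_2 \|\mathcal{S}\| \le n F_n(\lambda). \]

Thus,
\begin{align*}
H(M_2 | Z^n, s_{0,n}) + \lambda H(M_1 | Y^n, s_{0,n}') + (1 + \lambda) \log_2 \|\mathcal{S}\|
&\ge n \big( R_2 + \lambda R_1 - F_n(\lambda) \big) \\
&\ge n \big( R_2 + \lambda R_1 - \lim_{n \to \infty} F_n(\lambda) \big) > n \epsilon.
\end{align*}
Combined with (16), this completes the proof of the lemma. ■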

D. Convergence 

In this subsection we show that $\lim_{n \to \infty} F_n(\lambda)$ exists and is
finite for the channel under consideration, when $\lambda \in [0, 1]$.

The proof of convergence extends the arguments in [5,
Appendix 4A] to the FSBC. The main difficulty here is the
introduction of the auxiliary RV $U^n$ and its interaction with
the other RVs, $S^n$, $X^n$, $Y^n$ and $Z^n$. We actually show that

\[ \lim_{n \to \infty} F_n(\lambda) = \sup_{n} F_n(\lambda), \]



which implies that the limit exists. Due to its length, the full 
proof is omitted and only the main points are highlighted. 

Let $s_0 = s_0(l)$ minimize $\frac{1}{l} I(U^l; Z^l | s_0)$ and
$s_0' = s_0'(l)$ minimize $\frac{1}{l} I(X^l; Y^l | U^l, s_0')$, for the triplet
$(q_1(u^l, x^l), s_0(l), s_0'(l))$ that achieves the max-min solution
for $F_l(\lambda)$, and let $(q_2(u^m, x^m), s_0(m), s_0'(m))$ achieve the
max-min solution $F_m(\lambda)$. Finally, let $s_0(n)$ and $s_0'(n)$ be the
states that achieve the max-min solution for $F_n(\lambda)$. We show
that $F_n(\lambda)$ is sup-additive, i.e., for every integer $m, l \in [0, n]$
with $n = m + l$ we have

\[ n F_n(\lambda) \ge l F_l(\lambda) + m F_m(\lambda). \]

Sup-additivity is verified by breaking the length $n$
expressions into expressions of length $l$ and expressions of length
$m$. The critical part here is to consider the length $m$ sequence
from $l + 1$ to $n$. Here we use the fact that given the initial state
the channel is stationary, so $p(Z_{l+1}^{n}, Y_{l+1}^{n} | x_{l+1}^{n}, s_l = s_0) =
p(Z_1^m, Y_1^m | x_1^m = x_{l+1}^{n}, s_0)$. This, combined with the fact that the
cardinality bound depends only on the length of the sequence,
leads to the conclusion that the joint distribution $q_2(u^m, x^m)$
that maximizes $F_m(\lambda)$ will maximize the segment from $l + 1$
to $n$ (i.e. is the maximizing distribution for $(U_{l+1}^{n}, X_{l+1}^{n})$, with
the same initial state).

Additionally, both $\frac{1}{n} I(U^n; Z^n | s_0)$ and $\frac{1}{n} I(X^n; Y^n | U^n, s_0')$
are bounded from above, independent of $n$:

\[ \frac{1}{n} I(U^n; Z^n | s_0) \le \log_2 \|\mathcal{Z}\|, \]

since all the $Z_i$'s are defined over the same alphabet $\mathcal{Z}_i =
\mathcal{Z}$, and similarly $\frac{1}{n} I(X^n; Y^n | U^n, s_0') \le \log_2 \|\mathcal{X}\|$. Thus,
$F_n(\lambda) \le \log_2 \|\mathcal{Z}\| + \lambda \log_2 \|\mathcal{X}\| < \infty$ for any $\lambda \in [0, 1]$.
The fact that $F_n(\lambda)$ is bounded from above independent of
$n$ and is also sup-additive implies that $\lim_{n \to \infty} F_n(\lambda)$ exists
and is finite.
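
The step from sup-additivity and boundedness to convergence is the standard argument usually attributed to Fekete. A brief sketch of it follows; the shorthand $a_n$ and $F^*$ is introduced here for illustration only and does not appear elsewhere in the paper.

```latex
% Sketch of the standard argument; a_n and F^* are shorthand introduced here.
% Let a_n := n F_n(\lambda), a_0 := 0, and F^* := \sup_n F_n(\lambda) < \infty.
\begin{align*}
&\text{Fix } m \ge 1 \text{ and write } n = km + r, \quad 0 \le r < m. \\
&\text{Sup-additivity gives } a_n \ge k\, a_m + a_r, \text{ hence }
  F_n(\lambda) \ge \frac{km}{n} F_m(\lambda) + \frac{a_r}{n}. \\
&\text{As } n \to \infty: \ \frac{km}{n} \to 1 \text{ and } \frac{a_r}{n} \to 0
  \ (a_r \text{ takes finitely many values}), \\
&\text{so } \liminf_{n \to \infty} F_n(\lambda) \ge F_m(\lambda) \text{ for every } m. \\
&\text{Taking the supremum over } m: \quad
  F^* \le \liminf_{n \to \infty} F_n(\lambda) \le \limsup_{n \to \infty} F_n(\lambda) \le F^*.
\end{align*}
```

The two outer bounds coincide, so the limit exists and equals $\sup_n F_n(\lambda)$, which is finite by the upper bound above.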

Combining the fact that the limit exists with Sections IV-A,
IV-B and IV-C gives the capacity of the FSBC of Theorem 1.